24 research outputs found

    An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks

    Background: Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn). Results: We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (×3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. Conclusions: The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
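
The abstract above does not spell out how the k-means step is run, so the following is only a minimal sketch of the general idea, not the km2gcn implementation. It assumes that an initial module label per gene is already available (for example from WGCNA), that each module centroid is the mean standardised expression profile of its current members, and that genes are reassigned to the centroid with the highest Pearson correlation. The function names `standardise` and `refine_modules` are hypothetical.

```python
import numpy as np


def standardise(expr):
    """Scale each row to zero mean and unit variance across samples."""
    centred = expr - expr.mean(axis=1, keepdims=True)
    sd = centred.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # constant rows stay at zero instead of dividing by 0
    return centred / sd


def refine_modules(expr, labels, max_iter=30):
    """k-means-style refinement of an existing gene-to-module partition.

    expr   : genes x samples expression matrix
    labels : initial module label per gene (assumed to come from WGCNA)
    Returns the refined label per gene.
    """
    z = standardise(np.asarray(expr, dtype=float))
    labels = np.asarray(labels).copy()
    n_samples = z.shape[1]
    for _ in range(max_iter):
        modules = np.unique(labels)  # modules that emptied out are dropped
        # Assumption: centroid = mean standardised profile of current members.
        centroids = np.vstack([z[labels == m].mean(axis=0) for m in modules])
        # Pearson correlation of every gene with every centroid.
        corr = z @ standardise(centroids).T / n_samples
        new_labels = modules[np.argmax(corr, axis=1)]
        if np.array_equal(new_labels, labels):
            break  # converged: no gene changed module
        labels = new_labels
    return labels
```

Calling `refine_modules(expr, wgcna_labels)` on a genes-by-samples matrix returns one label per gene; genes whose profile fits another module's centroid better are moved there, which is the sense in which "misplaced genes" are corrected in this sketch.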

    Frontotemporal dementia: insights into the biological underpinnings of disease through gene co-expression network analysis

    BACKGROUND: In frontotemporal dementia (FTD) there is a critical lack of understanding of the biological and molecular mechanisms involved in disease pathogenesis. The heterogeneous genetic features associated with FTD suggest that multiple disease mechanisms are likely to contribute to the development of this neurodegenerative condition. We here present a systems biology approach with the aim of i) shedding light on the biological processes potentially implicated in the pathogenesis of FTD and ii) identifying novel potential risk factors for FTD. We performed a gene co-expression network analysis of microarray expression data from 101 individuals without neurodegenerative diseases to explore regional-specific co-expression patterns in the frontal and temporal cortices for 12 genes (MAPT, GRN, CHMP2B, CTSC, HLA-DRA, TMEM106B, C9orf72, VCP, UBQLN2, OPTN, TARDBP and FUS) associated with FTD. We then carried out gene set enrichment and pathway analyses, and investigated known protein-protein interactors (PPIs) of FTD-gene products. RESULTS: Gene co-expression networks revealed that several FTD-genes (such as MAPT and GRN; CTSC and HLA-DRA; TMEM106B; and C9orf72, VCP, UBQLN2 and OPTN) were clustering in modules of relevance in the frontal and temporal cortices. Functional annotation and pathway analyses of such modules indicated enrichment for: i) DNA metabolism, i.e. transcription regulation, DNA protection and chromatin remodelling (MAPT and GRN modules); ii) immune and lysosomal processes (CTSC and HLA-DRA modules); and iii) protein meta/catabolism (C9orf72, VCP, UBQLN2 and OPTN, and TMEM106B modules). PPI analysis supported the results of the functional annotation and pathway analyses.
    CONCLUSIONS: This work further characterizes known FTD-genes and elaborates on their biological relevance to disease: we not only indicate likely impacted regional-specific biological processes driven by modules containing FTD-genes, but also suggest novel potential risk factors among the FTD-gene interactors as targets for further mechanistic characterization in hypothesis-driven cell biology work.

    Gene co-expression networks shed light into diseases of brain iron accumulation

    Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes, remain largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicate that these networks are disrupted by excessive brain iron loading. We identified multiple cell types involved in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention.

    CoExp: A Web Tool for the Exploitation of Co-expression Networks

    Gene co-expression networks are a powerful type of analysis to construct gene groupings based on transcriptomic profiling. Co-expression networks make it possible to discover modules of genes whose mRNA levels are highly correlated across samples. Subsequent annotation of modules often reveals biological functions and/or evidence of cellular specificity for cell types implicated in the tissue being studied. There are multiple ways to perform such analyses, with weighted gene co-expression network analysis (WGCNA) among the most widely used R packages. While managing a few network models can be done manually, it is often more advantageous to study a wider set of models derived from multiple independently generated transcriptomic data sets (e.g., multiple networks built from many transcriptomic sources). However, there is no software tool available that allows this to be easily achieved. Furthermore, the visual nature of co-expression networks in combination with the coding skills required to explore networks makes the construction of a web-based platform for their management highly desirable. Here, we present the CoExp Web application, a user-friendly online tool that allows the exploitation of the full collection of 109 co-expression networks provided by the CoExpNets suite of R packages. We describe the usage of CoExp, including its contents and the functionality available through the family of CoExpNets packages. All the tools presented, including the web front- and back-ends, are available for the research community so any research group can build its own suite of networks and make them accessible through their own CoExp Web application. Therefore, this paper is of interest to both researchers wishing to annotate their genes of interest across different brain network models and specialists interested in the creation of GCNs looking for a tool to appropriately manage, use, publish, and share their networks in a consistent and productive manner.

    Incomplete annotation has a disproportionate impact on our understanding of Mendelian and complex neurogenetic disorders

    Growing evidence suggests that human gene annotation remains incomplete; however, it is unclear how this affects different tissues and our understanding of different disorders. Here, we detect previously unannotated transcription from Genotype-Tissue Expression RNA sequencing data across 41 human tissues. We connect this unannotated transcription to known genes, confirming that human gene annotation remains incomplete, even among well-studied genes including 63% of the Online Mendelian Inheritance in Man–morbid catalog and 317 neurodegeneration-associated genes. We find the greatest abundance of unannotated transcription in brain and genes highly expressed in brain are more likely to be reannotated. We explore examples of reannotated disease genes, such as SNCA, for which we experimentally validate a previously unidentified, brain-specific, potentially protein-coding exon. We release all tissue-specific transcriptomes through vizER: http://rytenlab.com/browser/app/vizER. We anticipate that this resource will facilitate more accurate genetic analysis, with the greatest impact on our understanding of Mendelian and complex neurogenetic disorders

    An integrated genomic approach to dissect the genetic landscape regulating the cell-to-cell transfer of α-synuclein

    Neuropathological and experimental evidence suggests that the cell-to-cell transfer of α-synuclein has an important role in the pathogenesis of Parkinson's disease (PD). However, the mechanism underlying this phenomenon is not fully understood. We undertook a genome-wide small interfering RNA (siRNA) screen to identify genes regulating the cell-to-cell transfer of α-synuclein. A genetically encoded reporter, GFP-2A-αSynuclein-RFP, suitable for separating donor and recipient cells, was transiently transfected into HEK cells stably overexpressing α-synuclein. We find that 38 genes regulate the transfer of α-synuclein-RFP, one of which is ITGA8, a candidate gene identified through a recent PD genome-wide association study (GWAS). Weighted gene co-expression network analysis (WGCNA) and weighted protein-protein network interaction analysis (WPPNIA) show that those hits cluster in networks that include known PD genes more frequently than expected by random chance. The findings expand our understanding of the mechanism of α-synuclein spread.

    Regulatory sites for splicing in human basal ganglia are enriched for disease-relevant information

    Genome-wide association studies have generated an increasing number of common genetic variants associated with neurological and psychiatric disease risk. An improved understanding of the genetic control of gene expression in human brain is vital considering this is the likely modus operandi for many causal variants. However, human brain sampling complexities limit the explanatory power of brain-related expression quantitative trait loci (eQTL) and allele-specific expression (ASE) signals. We address this, using paired genomic and transcriptomic data from putamen and substantia nigra from 117 human brains, interrogating regulation at different RNA processing stages and uncovering novel transcripts. We identify disease-relevant regulatory loci, find that splicing eQTLs are enriched for regulatory information of neuron-specific genes, that ASEs provide cell-specific regulatory information with evidence for cellular specificity, and that incomplete annotation of the brain transcriptome limits interpretation of risk loci for neuropsychiatric disease. This resource of regulatory data is accessible through our web server, http://braineacv2.inf.um.es/

    Identification of novel risk loci, causal insights, and heritable risk for Parkinson's disease: a meta-analysis of genome-wide association studies

    Background: Genome-wide association studies (GWAS) in Parkinson's disease have increased the scope of biological knowledge about the disease over the past decade. We aimed to use the largest aggregate of GWAS data to identify novel risk loci and gain further insight into the causes of Parkinson's disease. / Methods: We did a meta-analysis of 17 datasets from Parkinson's disease GWAS available from European ancestry samples to nominate novel loci for disease risk. These datasets incorporated all available data. We then used these data to estimate heritable risk and develop predictive models of this heritability. We also used large gene expression and methylation resources to examine possible functional consequences as well as tissue, cell type, and biological pathway enrichments for the identified risk factors. Additionally, we examined shared genetic risk between Parkinson's disease and other phenotypes of interest via genetic correlations followed by Mendelian randomisation. / Findings: Between Oct 1, 2017, and Aug 9, 2018, we analysed 7·8 million single nucleotide polymorphisms in 37 688 cases, 18 618 UK Biobank proxy-cases (ie, individuals who do not have Parkinson's disease but have a first degree relative that does), and 1·4 million controls. We identified 90 independent genome-wide significant risk signals across 78 genomic regions, including 38 novel independent risk signals in 37 loci. These 90 variants explained 16–36% of the heritable risk of Parkinson's disease depending on prevalence. Integrating methylation and expression data within a Mendelian randomisation framework identified putatively associated genes at 70 risk signals underlying GWAS loci for follow-up functional studies. Tissue-specific expression enrichment analyses suggested Parkinson's disease loci were heavily brain-enriched, with specific neuronal cell types being implicated from single cell data. 
    We found significant genetic correlations with brain volumes (false discovery rate-adjusted p=0·0035 for intracranial volume, p=0·024 for putamen volume), smoking status (p=0·024), and educational attainment (p=0·038). Mendelian randomisation between cognitive performance and Parkinson's disease risk showed a robust association (p=8·00 × 10⁻⁷). / Interpretation: These data provide the most comprehensive survey of genetic risk within Parkinson's disease to date, to the best of our knowledge, by revealing many additional Parkinson's disease risk loci, providing a biological context for these risk factors, and showing that a considerable genetic component of this disease remains unidentified. These associations derived from European ancestry datasets will need to be followed up with more diverse data. / Funding: The National Institute on Aging at the National Institutes of Health (USA), The Michael J Fox Foundation, and The Parkinson's Foundation (see appendix for full list of funding sources).

    Identification of Candidate Parkinson Disease Genes by Integrating Genome-Wide Association Study, Expression, and Epigenetic Data Sets

    Importance: Substantial genome-wide association study (GWAS) work in Parkinson disease (PD) has led to the discovery of an increasing number of loci shown reliably to be associated with increased risk of disease. Improved understanding of the underlying genes and mechanisms at these loci will be key to understanding the pathogenesis of PD. / Objective: To investigate what genes and genomic processes underlie the risk of sporadic PD. / Design and Setting: This genetic association study used the bioinformatic tools Coloc and transcriptome-wide association study (TWAS) to integrate PD case-control GWAS data published in 2017 with expression data (from Braineac, the Genotype-Tissue Expression [GTEx], and CommonMind) and methylation data (derived from UK Parkinson brain samples) to uncover putative gene expression and splicing mechanisms associated with PD GWAS signals. Candidate genes were further characterized using cell-type specificity, weighted gene coexpression networks, and weighted protein-protein interaction networks. / Main Outcomes and Measures: It was hypothesized a priori that some genes underlying PD loci would alter PD risk through changes to expression, splicing, or methylation. Candidate genes are presented whose change in expression, splicing, or methylation is associated with risk of PD as well as the functional pathways and cell types in which these genes have an important role. / Results: Gene-level analysis of expression revealed 5 genes (WDR6 [OMIM 606031], CD38 [OMIM 107270], GPNMB [OMIM 604368], RAB29 [OMIM 603949], and TMEM163 [OMIM 618978]) that replicated using both Coloc and TWAS analyses in both the GTEx and Braineac expression data sets. A further 6 genes (ZRANB3 [OMIM 615655], PCGF3 [OMIM 617543], NEK1 [OMIM 604588], NUPL2 [NCBI 11097], GALC [OMIM 606890], and CTSB [OMIM 116810]) showed evidence of disease-associated splicing effects.
    Cell-type specificity analysis revealed that gene expression was overall more prevalent in glial cell types compared with neurons. The weighted gene coexpression analysis performed on the GTEx data set showed that NUPL2 is a key gene in 3 modules implicated in catabolic processes associated with protein ubiquitination and in the ubiquitin-dependent protein catabolic process in the nucleus accumbens, caudate, and putamen. TMEM163 and ZRANB3 were both important in modules in the frontal cortex and caudate, respectively, indicating regulation of signaling and cell communication. Protein interactor analysis and simulations using random networks demonstrated that the candidate genes interact significantly more with known mendelian PD and parkinsonism proteins than would be expected by chance. / Conclusions and Relevance: Together, these results suggest that several candidate genes and pathways are associated with the findings observed in PD GWAS studies.

    A systems-level analysis highlights microglial activation as a modifying factor in common epilepsies

    Aims: The causes of distinct patterns of reduced cortical thickness in the common human epilepsies, detectable on neuroimaging and with important clinical consequences, are unknown. We investigated the underlying mechanisms of cortical thinning using a systems-level analysis. // Methods: Imaging-based cortical structural maps from a large-scale epilepsy neuroimaging study were overlaid with highly spatially resolved human brain gene expression data from the Allen Human Brain Atlas. Cell-type deconvolution, differential expression analysis and cell-type enrichment analyses were used to identify differences in cell-type distribution. These differences were followed up in post-mortem brain tissue from humans with epilepsy using Iba1 immunolabelling. Furthermore, to investigate a causal effect in cortical thinning, cell-type specific depletion was used in a murine model of acquired epilepsy. // Results: We identified elevated fractions of microglia and endothelial cells in regions of reduced cortical thickness. Differentially expressed genes showed enrichment for microglial markers, and in particular, activated microglial states. Analysis of post-mortem brain tissue from humans with epilepsy confirmed excess activated microglia. In the murine model, transient depletion of activated microglia during the early phase of the disease development prevented cortical thinning and neuronal cell loss in the temporal cortex. Although the development of chronic seizures was unaffected, the epileptic mice with early depletion of activated microglia did not develop deficits in a non-spatial memory test seen in epileptic mice not depleted of microglia. // Conclusions: These convergent data strongly implicate activated microglia in cortical thinning, representing a new dimension for concern and disease modification in the epilepsies, potentially distinct from seizure control